mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

نویسندگان

  • Jakub Konecný
  • Jie Liu
  • Peter Richtárik
  • Martin Takác
چکیده

We propose a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent applied to the problem of minimizing a strongly convex composite function represented as the sum of an average of a large number of smooth convex functions, and simple nonsmooth convex function. Our method first performs a deterministic step (computation of the gradient of the objective function at the starting point), followed by a large number of stochastic steps. The process is repeated a few times with the last iterate becoming the new starting point. The novelty of our method is in introduction of mini-batching into the computation of stochastic steps. In each step, instead of choosing a single function, we sample b functions, compute their gradients, and compute the direction based on this. We analyze the complexity of the method and show that the method benefits from two speedup effects. First, we prove that as long as b is below a certain threshold, we can reach predefined accuracy with less overall work than without mini-batching. Second, our mini-batching scheme admits a simple parallel implementation, and hence is suitable for further acceleration by parallelization.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Stochastic Proximal Gradient Descent with Acceleration Techniques

Proximal gradient descent (PGD) and stochastic proximal gradient descent (SPGD) are popular methods for solving regularized risk minimization problems in machine learning and statistics. In this paper, we propose and analyze an accelerated variant of these methods in the mini-batch setting. This method incorporates two acceleration techniques: one is Nesterov’s acceleration method, and the othe...

متن کامل

Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption

We propose a projected semi-stochastic gradient descent method with mini-batch for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method (SGD). We are able to prove linear convergence under weak strong convexity assumption. This requires no strong convexity assumption for minimizing the sum of smooth convex functions subject to a c...

متن کامل

Distributed and Scalable Variance-reduced Stochastic Gradient Descent

1) There exists a study on employing mini-batch approach on SVRG, one of the VR methods. It shows that the approach cannot scale well that there is no significant difference between using 16 threads and more[2]. This study observes the cause of the poor scalability of this existing mini-batch approach on VR method. 2) The performance of mini-batch approach on distributed setting is improved by ...

متن کامل

Accelerated Mini-batch Randomized Block Coordinate Descent Method

We consider regularized empirical risk minimization problems. In particular, we minimize the sum of a smooth empirical risk function and a nonsmooth regularization function. When the regularization function is block separable, we can solve the minimization problems in a randomized block coordinate descent (RBCD) manner. Existing RBCD methods usually decrease the objective value by exploiting th...

متن کامل

Submodular Mini-Batch Training in Generative Moment Matching Networks

Generative moment matching network (GMMN), which is based on the maximum mean discrepancy (MMD) measure, is a generative model for unsupervised learning, where the mini-batch stochastic gradient descent is applied for the update of parameters. In this work, instead of obtaining a mini-batch randomly, each mini-batch in the iterations is selected in a submodular way such that the most informativ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1410.4744  شماره 

صفحات  -

تاریخ انتشار 2014